This notebook provides a template for implementing, in stages, the functionality required to complete this project. If you need additional code that cannot be included in the notebook, be sure the Python code is successfully imported and included in your submission. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation. Note that some implementation sections are optional, and will be marked with 'Optional' in the header.
In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited, typically by double-clicking the cell to enter edit mode.
import cv2
import numpy as np
import scipy.ndimage as ndimage
import matplotlib.pyplot as plt
from skimage import io
import tensorflow as tf
from tensorflow.contrib.layers import flatten
# Load pickled data
import pickle
print("Loading Data...")
# TODO: Fill this in based on where you saved the training and testing data
training_file = './train.p'
testing_file = './test.p'
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
train_features, train_labels = train['features'], train['labels']
test_features, test_labels = test['features'], test['labels']
print("Loading Complete.")
The pickled data is a dictionary with 4 key/value pairs:
- 'features' is a 4D array containing raw pixel data of the traffic sign images, (num examples, width, height, channels).
- 'labels' is a 2D array containing the label/class id of the traffic sign. The file signnames.csv contains id -> name mappings for each id.
- 'sizes' is a list containing tuples, (width, height), representing the original width and height of the image.
- 'coords' is a list containing tuples, (x1, y1, x2, y2), representing coordinates of a bounding box around the sign in the image. THESE COORDINATES ASSUME THE ORIGINAL IMAGE. THE PICKLED DATA CONTAINS RESIZED VERSIONS (32 by 32) OF THESE IMAGES.

Complete the basic data summary below.
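Since signnames.csv is referenced throughout the project, a small helper to load the id -> name mapping can be handy. This is a sketch that assumes the standard `ClassId,SignName` header row:

```python
import csv

def load_sign_names(path='signnames.csv'):
    """Return a dict mapping class id (int) -> sign name (str).

    Assumes a CSV with a 'ClassId,SignName' header, as in signnames.csv.
    """
    with open(path) as f:
        return {int(row['ClassId']): row['SignName'] for row in csv.DictReader(f)}
```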
### Replace each question mark with the appropriate value.
# TODO: Number of training examples
n_train = len(train_features)
# TODO: Number of testing examples.
n_test = len(test_features)
# TODO: What's the shape of a traffic sign image?
image_shape = train_features[0].shape
# TODO: How many unique classes/labels are there in the dataset?
n_classes = max(train_labels) + 1  # assumes class ids are contiguous and start at 0
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
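Note that `max(train_labels) + 1` only equals the number of classes when the ids are contiguous and start at 0 (true for this dataset). A quick sanity check, sketched with toy labels:

```python
import numpy as np

labels = np.array([0, 2, 1, 2, 0, 1])  # toy stand-in for train_labels
n_classes = int(labels.max()) + 1
# holds only when ids are 0..n_classes-1 with no gaps
assert n_classes == len(np.unique(labels))
```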
Visualize the German Traffic Signs Dataset using the pickled file(s). This is open ended, suggestions include: plotting traffic sign images, plotting the count of each sign, etc.
The Matplotlib examples and gallery pages are a great resource for doing visualizations in Python.
NOTE: It's recommended you start with something simple first. If you wish to do more, come back to it after you've completed the rest of the sections.
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
print("Visualizing Data...")
train_features = np.array(train['features'])
train_labels = np.array(train['labels'])
input_counts = np.bincount(train_labels)
max_input = np.max(input_counts)
min_input = np.min(input_counts)
print("The maximum number of inputs per class is:", max_input)
print("The minimum number of inputs per class is:", min_input)
print("Total number of training samples:", sum(input_counts))
### Visualize training data ###
figure1 = plt.figure()
a = figure1.add_subplot(111)
a.set_title('Distribution of Training Samples')
a.set_xlabel('Class')
a.set_ylabel('Number of Samples')
a.bar(range(len(input_counts)), input_counts, 1, color='orange')
plt.show()
### Visualize training images with labels ###
for i in range(n_classes):
    for j in range(len(train_labels)):
        if (i == train_labels[j]):
            # Show the first sample found for each class
            print('Class: ', i)
            plt.imshow(train_features[j])
            plt.show()
            break
print("Data Visualization Complete.")
Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the German Traffic Sign Dataset.
There are various aspects to consider when thinking about this problem:
Here is an example of a published baseline model on this problem. It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.
NOTE: The LeNet-5 implementation shown in the classroom at the end of the CNN lesson is a solid starting point. You'll have to change the number of classes and possibly the preprocessing, but aside from that it's plug and play!
### Generate additional data (OPTIONAL!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.
### Preprocess the data here.
### Feel free to use as many code cells as needed.
### Create additional images to compensate for the uneven class distribution ###
print("Generating additional (rotated) images...")
# 1-degree steps from 10 to -10 degrees, skipping 0 (which would duplicate the original)
angles = [a for a in range(10, -11, -1) if a != 0]
for i in range(len(input_counts)):
    input_class_ratio = min(int(max_input / input_counts[i]) - 1, len(angles) - 1)
    if input_class_ratio <= 1:
        continue
    new_train_features = []
    new_train_labels = []
    train_mask1 = np.where(train_labels == i)
    for j in range(input_class_ratio):
        for feature in train_features[train_mask1]:
            new_train_features.append(ndimage.rotate(feature, angles[j], reshape=False))
            new_train_labels.append(i)
    train_features = np.append(train_features, new_train_features, axis=0)
    train_labels = np.append(train_labels, new_train_labels, axis=0)
print('Additional data generation complete.')
print('New training set size with rotated images: ', len(train_features))
def class_distribution(y, n_classes):
    labels = np.array(range(0, n_classes))
    n_examples = np.zeros(n_classes, dtype=np.int32)
    for i in labels:
        # Count the examples of class i in y
        n_examples[i] = sum(y == i)
    return (labels, n_examples)
### Transformation Functions ###
# Image transform 1 (squeeze right)
def transform_1(image):
    p1 = np.float32([[0,0],[32,0],[0,32],[32,32]])
    p2 = np.float32([[0,0],[32,5],[0,32],[32,27]])
    M = cv2.getPerspectiveTransform(p1, p2)  # perspective matrix
    return cv2.warpPerspective(image, M, (32, 32))
# Image transform 2 (squeeze left)
def transform_2(image):
    p1 = np.float32([[0,0],[32,0],[0,32],[32,32]])
    p2 = np.float32([[0,5],[32,0],[0,27],[32,32]])
    M = cv2.getPerspectiveTransform(p1, p2)
    return cv2.warpPerspective(image, M, (32, 32))
# Image transform 3 (stretch image)
def transform_3(image):
    p1 = np.float32([[0,0],[32,0],[0,32],[32,32]])
    p2 = np.float32([[0,0],[27,5],[5,32],[32,27]])
    M = cv2.getPerspectiveTransform(p1, p2)
    return cv2.warpPerspective(image, M, (32, 32))
# Image transform 4 (stretch image 2)
def transform_4(image):
    p1 = np.float32([[0,0],[32,0],[0,32],[32,32]])
    p2 = np.float32([[5,5],[32,0],[0,27],[27,32]])
    M = cv2.getPerspectiveTransform(p1, p2)
    return cv2.warpPerspective(image, M, (32, 32))
# Use random number to decide which image transformation to use
def new_image(image):
    transform_number = np.random.randint(0, 4)
    if transform_number == 0:
        return transform_1(image)
    elif transform_number == 1:
        return transform_2(image)
    elif transform_number == 2:
        return transform_3(image)
    else:
        return transform_4(image)
def generate_data(X, y, orig_class_distribution, multiplication_factor=4):
    X_new = []
    y_new = []
    # Get minimum final size of the new training set, and the target number of images per class
    final_size = multiplication_factor * X.shape[0]
    n_classes = len(orig_class_distribution)
    n_images_per_class = int(np.ceil(final_size / n_classes))
    for i in range(X.shape[0]):
        # Compute the number of extra copies needed for this example's class
        class_idx = y[i]
        class_count = orig_class_distribution[class_idx]
        n_new_images = int(np.ceil((n_images_per_class - class_count) / class_count))
        # Create new images with the transformation functions
        for j in range(n_new_images):
            X_new.append(new_image(X[i]))
            y_new.append(y[i])
    return (np.array(X_new), np.array(y_new))
def add_transform_data(X, y, orig_class_distribution):
    X_new, y_new = generate_data(X, y, orig_class_distribution)
    X = np.concatenate((X, X_new), axis=0)
    y = np.concatenate((y, y_new), axis=0)
    return (X, y)
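To make the oversampling formula concrete, here is a worked example with hypothetical but representative numbers (39,209 originals, 43 classes, multiplication factor 4, and one underrepresented class with 210 samples):

```python
import numpy as np

final_size = 4 * 39209                              # target size of the augmented set
n_images_per_class = int(np.ceil(final_size / 43))  # -> 3648 images per class
class_count = 210                                   # a hypothetical underrepresented class
n_new_images = int(np.ceil((n_images_per_class - class_count) / class_count))
# -> 17: each image in that class gets 17 transformed copies
```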
print("Generating additional (transformed) images...")
### Use transformations to create even more data ###
_, old_distribution = class_distribution(train_labels, n_classes)
train_features, train_labels = add_transform_data(train_features, train_labels, old_distribution)
print('Additional data generation complete.')
print('New training set size with transformed and rotated images: ', len(train_features))
### Visualize updated class distribution after adding more data ###
input_counts = np.bincount(train_labels)
figure2 = plt.figure()
ax = figure2.add_subplot(111)
ax.set_title('Number of inputs per class w/ Additional Data')
ax.set_xlabel('Class')
ax.set_ylabel('Number of Inputs')
ax.bar(range(len(input_counts)), input_counts, 1, color='green')
plt.show()
'''### Convert the images to grayscale ###
print('Converting images to Grayscale...')
train_features = [cv2.cvtColor(train_features[n,:,:,:], cv2.COLOR_BGR2GRAY)
                  for n in range(np.shape(train_features)[0])]
test_features = [cv2.cvtColor(test_features[n,:,:,:], cv2.COLOR_BGR2GRAY)
                 for n in range(np.shape(test_features)[0])]
train_features = np.reshape(train_features, (np.shape(train_features)[0],32,32,1))
test_features = np.reshape(test_features, (np.shape(test_features)[0],32,32,1))
print('Grayscale Conversion Complete.')'''
### Sample grayscale image ###
'''plt.imshow(train_features[200000], cmap='gray')
plt.title('Sample Gray Image')
plt.show()'''
### Sharpen images to try to optimize features for training ###
'''
print("Sharpening images...")
#sharp_image = scipy.misc.imfilter(train_features, 'sharpen')
blurred_image = ndimage.gaussian_filter(train_features, 0)
blurred_image_filter = ndimage.gaussian_filter(blurred_image, 0.2)
alpha = 255
sharpened_image1 = blurred_image + alpha * (blurred_image - blurred_image_filter)
sharpened_image2 = sharpened_image1 + alpha * (sharpened_image1 - blurred_image_filter)
sharpened_image3 = sharpened_image2 + alpha * (sharpened_image2 - blurred_image_filter)
sharpened_image4 = sharpened_image3 + alpha * (sharpened_image3 - blurred_image_filter)
sharpened_image5 = sharpened_image4 + alpha * (sharpened_image4 - blurred_image_filter)
sharpened_image6 = sharpened_image5 + alpha * (sharpened_image5 - blurred_image_filter)
print("Sharpening Complete.")'''
### Normalize training and test features ###
print('Normalizing features...')
train_features = train_features / 255.
test_features = test_features / 255.
print('Normalizing complete.')
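Dividing by 255 maps the pixel values to [0, 1]. A common alternative (not used here) is zero-centering, which maps pixels to roughly [-1, 1]; a minimal sketch comparing the two:

```python
import numpy as np

# Hypothetical 32x32 RGB image standing in for a training sample
img = np.random.randint(0, 256, (32, 32, 3)).astype(np.float32)
scaled = img / 255.0              # as used above: values in [0, 1]
centered = (img - 128.0) / 128.0  # zero-centered alternative: roughly [-1, 1]
assert scaled.min() >= 0.0 and scaled.max() <= 1.0
assert centered.min() >= -1.0 and centered.max() <= 1.0
```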
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Get randomized datasets for training and validation ###
print('Randomizing datasets...')
from sklearn.model_selection import train_test_split
train_features, valid_features, train_labels, valid_labels = train_test_split(
    train_features,
    train_labels,
    test_size=0.2,
    random_state=121
)
print("Randomized datasets complete.")
Describe how you preprocessed the data. Why did you choose that technique?
Answer:
After looking at a few of the training examples and plotting the training data, I decided to focus on three main aspects of image preprocessing. The first thing I noticed about the training data was the very uneven class distribution. Only about 7 or 8 classes had a large number of samples (around 2000) and most of the remaining classes had a much smaller sample size. I decided to create additional samples for the underrepresented classes by rotating the original samples in one-degree increments. I didn't want to rotate the images too much because changing the orientation of the road signs could harm the training process. After adding the rotated images, I created even more data by "squeezing" and "stretching" the images using image transforms. The end result was a very even class distribution and a much larger data set (about 9 times larger) than the original 39,209 images.
After creating the additional samples, I tried sharpening the images in an attempt to extract more features. This didn't seem to improve the training; I think this is because the images are low resolution. I also tried converting the images to greyscale before training. The training process was much more efficient, but the training/test accuracy was slightly lower than with the color images. Therefore, I decided not to blur or convert the images to greyscale, since both would leave fewer features for the neural network to train with. I also kept the color information because I believe it is very important for recognizing/classifying road signs and I didn't want to remove those features from the training.
The final step in my preprocess stage was to normalize the image pixel values in order to reduce the image variability. This seemed to help quite a bit with making the training more efficient.
Final Preprocessing Method
I decided to preprocess the images with the following two-step process:
1. Augment the training set with rotated and perspective-transformed copies to even out the class distribution.
2. Normalize the pixel values to the range [0, 1].
Describe how you set up the training, validation and testing data for your model. Optional: If you generated additional data, how did you generate the data? Why did you generate the data? What are the differences in the new dataset (with generated data) from the original dataset?
Answer:
I set up the training, validation, and testing data sets using a standard holdout split: 20% of the data was reserved for validation and 80% was used for training. The original testing data set was left unmodified.
I decided to create additional samples by rotating the original images in one degree increments from 10 to -10 degrees and also transforming the images in four different ways. I decided to generate additional data because the original data set had a very uneven distribution of sample images per class. Some classes had around 2000 samples while others only had around 200 samples. The difference between the new data set and the original data set is the overall sample distribution is much larger and more uniform after the rotated/transformed images were added to the original data set.
### Define your architecture here.
### Feel free to use as many code cells as needed.
EPOCHS = 20
BATCH_SIZE = 150
def LeNet(x):
    # Hyperparameters
    mu = 0
    sigma = 0.1
    # Layer 1: Convolutional. Input = 32x32x3. Output = 28x28x6.
    conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 3, 6), mean=mu, stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(6))
    conv1 = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b
    # 1st activation
    conv1 = tf.nn.relu(conv1)
    # 1st max pooling layer. Input = 28x28x6. Output = 14x14x6.
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    # Layer 2: Convolutional. Output = 10x10x16.
    conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean=mu, stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(16))
    conv2 = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b
    # 2nd activation.
    conv2 = tf.nn.relu(conv2)
    # 2nd max pooling. Input = 10x10x16. Output = 5x5x16.
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    # Flatten. Input = 5x5x16. Output = 400.
    fc0 = flatten(conv2)
    # Layer 3: Fully Connected. Input = 400. Output = 120.
    fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean=mu, stddev=sigma))
    fc1_b = tf.Variable(tf.zeros(120))
    fc1 = tf.matmul(fc0, fc1_W) + fc1_b
    # 3rd activation.
    fc1 = tf.nn.relu(fc1)
    # Layer 4: Fully Connected. Input = 120. Output = 84.
    fc2_W = tf.Variable(tf.truncated_normal(shape=(120, 84), mean=mu, stddev=sigma))
    fc2_b = tf.Variable(tf.zeros(84))
    fc2 = tf.matmul(fc1, fc2_W) + fc2_b
    # 4th activation.
    fc2 = tf.nn.relu(fc2)
    # Layer 5: Fully Connected. Input = 84. Output = 43.
    fc3_W = tf.Variable(tf.truncated_normal(shape=(84, 43), mean=mu, stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(43))
    logits = tf.matmul(fc2, fc3_W) + fc3_b
    return logits
### Features and Labels ###
### x is a placeholder for a batch of input images
### y is a placeholder for a batch of labels
x = tf.placeholder(tf.float32, (None, 32, 32, 3))
y = tf.placeholder(tf.int32, (None))
one_hot_y = tf.one_hot(y, 43)
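For reference, `tf.one_hot` converts each integer label into a length-43 indicator vector. The NumPy equivalent, as a quick sketch with a toy batch:

```python
import numpy as np

labels = np.array([0, 2, 1])   # toy batch of class ids
one_hot = np.eye(43)[labels]   # shape (3, 43); row i has a 1 at column labels[i]
```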
What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.) For reference on how to build a deep neural network using TensorFlow, see Deep Neural Network in TensorFlow from the classroom.
Answer:
My architecture is directly based on the LeNet convolutional neural network from lesson 9. The original LeNet network performed very well once more data was introduced and normalized. The only modifications made were to the input dimensions and the final output dimension.
Network Layout:
1. Convolution 5x5, 3 -> 6 channels, valid padding, ReLU, 2x2 max pooling (32x32x3 -> 28x28x6 -> 14x14x6)
2. Convolution 5x5, 6 -> 16 channels, valid padding, ReLU, 2x2 max pooling (14x14x6 -> 10x10x16 -> 5x5x16)
3. Flatten (5x5x16 -> 400), fully connected 400 -> 120, ReLU
4. Fully connected 120 -> 84, ReLU
5. Fully connected 84 -> 43 (logits)
### Train your model here.
### Feel free to use as many code cells as needed.
### Training Pipeline ###
rate = 0.001
logits = LeNet(x)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=one_hot_y)
loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = optimizer.minimize(loss_operation)
### Model Evaluation ###
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()
def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
### Train Model ###
from sklearn.utils import shuffle
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    num_examples = len(train_features)
    print("Training...")
    print()
    for i in range(EPOCHS):
        train_features, train_labels = shuffle(train_features, train_labels)
        for offset in range(0, num_examples, BATCH_SIZE):
            end = offset + BATCH_SIZE
            batch_x, batch_y = train_features[offset:end], train_labels[offset:end]
            sess.run(training_operation, feed_dict={x: batch_x, y: batch_y})
        validation_accuracy = evaluate(valid_features, valid_labels)
        print("EPOCH {} ...".format(i+1))
        print("Validation Accuracy = {:.3f}".format(validation_accuracy))
        print()
    saver.save(sess, '/Users/Sean/CarND-Term1-Starter-Kit/model_4.ckpt')
    print("Model saved")
### Evaluate Model on Test data ###
with tf.Session() as session:
    saver.restore(session, '/Users/Sean/CarND-Term1-Starter-Kit/model_4.ckpt')
    print('Model restored with latest weights')
    test_accuracy = evaluate(test_features, test_labels)
    print("Test Accuracy = {:.3f}".format(test_accuracy))
How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)
Answer:
Optimizer: Adam Optimizer
Batch size: 150
Epochs: 20
I tried a few different learning rates but 0.001 seemed to work very well and I didn't have any issues with getting stuck in local minima. I also left the truncated normal distribution mean equal to 0 and the standard deviation equal to 0.1. Bias was always initialized to zero.
I trained the model through trial and error, and since I was training on my local CPU I tried to keep the batch size and number of epochs relatively low.
What approach did you take in coming up with a solution to this problem? It may have been a process of trial and error, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think this is suitable for the current problem.
Answer:
I began training with the LeNet architecture, intending to try other networks as well, but since LeNet worked so well from the beginning I decided to stick with it.
Most of my time was spent deciding on the best method to preprocess the data. Once preprocessing was finished, I focused on modifying the patch sizes, strides, batch sizes, etc. through trial and error. In the end I kept all the original hyperparameters since they seemed to produce the best results.
I believe the LeNet model worked very well in its original form because the training images were all nicely formatted (centered, cropped, close-up), and this made it easy for the LeNet model to classify the images with considerable accuracy. This makes sense, since LeNet was originally developed to classify digits by training on similarly framed numerical images. The model didn't have to search a huge scene just to pick out a small traffic sign at the edge of an image.
Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.
You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.
Use the code cell (or multiple code cells, if necessary) to implement this step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Load the images and plot them here.
### Feel free to use as many code cells as needed.
### Import and plot 10 test images which were collected from the internet ###
test_imgs = np.uint8(np.zeros((10,32,32,3)))
for i in range(1, 11):
    image = io.imread('/Users/Sean/CarND-Term1-Starter-Kit/test_images/pic{}.jpg'.format(str(i)))
    test_imgs[i-1] = image
test_img_data = test_imgs.reshape((10, 32, 32, 3)).astype(np.float32)
new_images = []
for i in range(0, 10):
    print('Test Image: ', i+1)
    new_images.append(test_imgs[i])
    plt.imshow(new_images[i])
    plt.show()
print(np.shape(new_images))
Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It could be helpful to plot the images in the notebook.
f, axarr = plt.subplots(5, 1)
for i in range(5):
    axarr[i].imshow(test_imgs[i])
    plt.setp(axarr[i].get_xticklabels(), visible=False)
    plt.setp(axarr[i].get_yticklabels(), visible=False)
plt.show()
Answer:
Image 1: The U.S. "Do Not Enter" sign is slightly different than the German version in the training set so it might not be able to classify it correctly.
Image 2: The "Deer Crossing" sign is shaped differently from the German sign in the training set, but the sign image is very similar, so it may or may not classify it correctly.
Image 3: The "Pedestrian Crossing" sign should not be a problem for the model to classify correctly since it is present in the training set.
Image 4: The image and shape of the U.S. "Slippery Road" sign is very different than the German version, which is in the training set, so the model will probably not classify it correctly.
Image 5: The U.S. "No U-Turn" sign isn't present in the training set so the model will most likely have difficulty classifying it.
Is your model able to perform equally well on captured pictures when compared to testing on the dataset? The simplest way to do this is to check the accuracy of the predictions. For example, if the model predicted 1 out of 5 signs correctly, it's 20% accurate.
NOTE: You could check the accuracy manually by using signnames.csv (same directory). This file has a mapping from the class id (0-42) to the corresponding sign name. So, you could take the class id the model outputs, lookup the name in signnames.csv and see if it matches the sign from the image.
Answer:
I used my model to make predictions for the five captured images pictured above. The resulting prediction accuracy on the captured images was 20%; however, the model's prediction accuracy on the training set was 93%.
Therefore, I believe my model did not perform well in this real world situation. One reason would be because four of the captured images were not present in the training set. These four images were U.S. road signs instead of the German road signs, which is what the model was trained with. The U.S. road signs look very different than the German version of the same sign and the model did not have any prior experience with recognizing the U.S. road sign features.
Another reason that my model did not perform well is because the captured images were not taken from the real world. The captured images were computer generated and the background was blank white. This may have thrown off my model and caused such low prediction accuracy for the captured images.
Image 1 "No Entry"
Model Prediction: (17) "No Entry" - Correct Prediction
Image 2 "Wild Animals Crossing"
Model Prediction: (10) "No Passing for Vehicles Over 3.5 Metric Tons" - Incorrect Prediction
Image 3 "Pedestrians"
Model Prediction: (24) "Road Narrows on Right" - Incorrect Prediction
Image 4 "Slippery Road"
Model Prediction: (17) "No Entry" - Incorrect Prediction
Image 5 "No U-Turn"
Model Prediction: (12) "Priority Road" - Incorrect Prediction
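The 20% figure can be reproduced from the class ids. The predicted ids below come from the list above; the ground-truth ids for the German-equivalent classes are my best reading of signnames.csv, and the "No U-Turn" sign has no class in the training set, so a placeholder id of -1 is used:

```python
import numpy as np

preds = np.array([17, 10, 24, 17, 12])  # the model's predictions listed above
truth = np.array([17, 31, 27, 23, -1])  # intended classes; -1 = not in training set
accuracy = float(np.mean(preds == truth))
print(accuracy)  # 0.2, i.e. 1 of 5 correct
```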
import tensorflow as tf
with tf.Session() as sess:
    saver.restore(sess, '/Users/Sean/CarND-Term1-Starter-Kit/model_4.ckpt')
    print('Model restored with latest weights')
    ### model evaluation on test images collected from the internet ###
    prediction = tf.argmax(logits, 1)
    test_prediction = sess.run(prediction, feed_dict={x: test_img_data})
    print('Test Image Predictions: ', test_prediction)
### Use Softmax and top_K functions to determine prediction probabilites ###
import tensorflow as tf
with tf.Session() as sess:
    saver.restore(sess, '/Users/Sean/CarND-Term1-Starter-Kit/model_4.ckpt')
    print('Model restored with latest weights')
    prediction = tf.nn.softmax(logits)
    topFive = tf.nn.top_k(prediction, k=5, sorted=True, name=None)
    top_k_feed_dict = {x: test_img_data}
    print('Softmax/Top_K Results (values and indices): ', sess.run(topFive, feed_dict=top_k_feed_dict))
### Store Values and Indices ###
with tf.Session() as session:
    saver.restore(session, '/Users/Sean/CarND-Term1-Starter-Kit/model_4.ckpt')
    print('Model restored with latest weights')
    top_k_probabilities_per_image = session.run(topFive, feed_dict=top_k_feed_dict)
values = np.array([top_k_probabilities_per_image.values])
indices = np.array([top_k_probabilities_per_image.indices])
### Visualize the softmax probabilities here ###
def plot_top_k_probabilities(pred_cls, pred_prob, title):
    # pred_cls / pred_prob each hold the top-5 classes/probabilities per test image
    for j in range(5):
        plt.plot(pred_cls[j], pred_prob[j], 'go')
        plt.ylim(0, 1.1)
        plt.xlim(-1, 45)
        plt.ylabel('Probability')
        plt.xlabel('Predicted Class')
        plt.title('Test Image {} Prediction Certainty'.format(j+1))
        plt.show()
for i in range(len(values)):
    plot_top_k_probabilities(indices[i], values[i], '')
Use the model's softmax probabilities to visualize the certainty of its predictions, tf.nn.top_k could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)
Answer:
The model reported essentially 100% certainty for every prediction: the softmax output was saturated, so the top probability was at or near 1.0 (effectively a one-hot vector) for each image.
The model only predicted the first image correctly. The remaining incorrect predictions did not include the correct class in the top 5 predictions.
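The saturation effect can be reproduced with plain NumPy: once one logit is much larger than the rest, softmax assigns it essentially all the probability mass, so the reported top-k "certainty" looks like ~100%. A sketch with made-up logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift by the max for numerical stability
    return e / e.sum()

# Hypothetical logits where one class dominates
p = softmax(np.array([25.0, 3.0, 1.0, 0.5, -2.0]))
# p[0] is ~1.0 even though the other logits are not tiny
```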
Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the IPython notebook as an HTML document. You can do this by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.